January 19, 2021 Tidy Tuesday submission.
A quick demonstration of affine transformations with the Kenya Census.
Thanks as always to the TidyTuesday crew at R4DS.
The data this week comes from rKenyaCensus courtesy of Shelmith Kariuki. Shelmith wrote about these datasets on her blog.
rKenyaCensus is an R package that contains the 2019 Kenya Population and Housing Census results. The results were released by the Kenya National Bureau of Statistics in February 2020, and published in four different pdf files (Volume 1 - Volume 4).
The 2019 Kenya Population and Housing Census was the eighth to be conducted in Kenya since 1948 and was conducted from the night of 24th/25th to 31st August 2019. Kenya leveraged on technology to capture data during cartographic mapping, enumeration and data transmission, making the 2019 Census the first paperless census to be conducted in Kenya.
Additional details about Kenya can be found on Wikipedia.
Kenya, officially the Republic of Kenya (Swahili: Jamhuri ya Kenya), is a country in Eastern Africa. At 580,367 square kilometres (224,081 sq mi), Kenya is the world’s 48th largest country by total area. With a population of more than 47.6 million people in the 2019 census, Kenya is the 29th most populous country. Kenya’s capital and largest city is Nairobi, while its oldest city and first capital is the coastal city of Mombasa.
tidyverse tidytuesdayR here ggplot2 ggthemes
TRUE TRUE TRUE TRUE TRUE
rKenyaCensus sf gganimate
FALSE TRUE FALSE
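The TRUE/FALSE rows above look like the named logical vector you get from trying to load each package in turn. A minimal sketch of one way to produce such a report, assuming a hypothetical helper `load_pkgs()` (not part of the original post) built on base R's `require()`:

```r
# Hypothetical helper: try to attach each package and report success per name.
load_pkgs <- function(pkgs) {
  vapply(
    pkgs,
    function(p) suppressWarnings(require(p, character.only = TRUE, quietly = TRUE)),
    logical(1)
  )
}

# Base packages are used here so the sketch runs on any R installation.
status <- load_pkgs(c("stats", "utils"))
print(status)
```

With a helper like this, a FALSE entry would mean the package could not be attached, which is worth checking before running the chunks that follow.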
Download the weekly data and make it available in the tt object. Use the rKenyaCensus package to do so, adding the polygons for each county.
# grab 3 tables of interest
crops <- rKenyaCensus::V4_T2.21
gender <- rKenyaCensus::V1_T2.2
households <- rKenyaCensus::V1_T2.3
counties <- rKenyaCensus::KenyaCounties_SHP %>%
st_as_sf %>% st_transform(4326)
# write them out
households %>%
write_csv("households.csv")
gender %>%
write_csv("gender.csv")
crops %>%
write_csv("crops.csv")
st_write(counties, "counties.geojson", delete_dsn = TRUE)
After saving, load from the directory directly:
households <- read_csv("households.csv")
gender <- read_csv("gender.csv")
crops <- read_csv("crops.csv")
counties <- st_read("counties.geojson")
Reading layer `counties' from data source `/home/kyouma_des/GoogleDrive/AG Resume and Work Stuff/website/_posts/2021-01-19-tidy-tuesday/counties.geojson' using driver `GeoJSON'
Simple feature collection with 47 features and 7 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: 33.9105 ymin: -4.679729 xmax: 41.91056 ymax: 5.466978
geographic CRS: WGS 84
Take an initial look at the format of the data available.
head(counties)
Simple feature collection with 6 features and 7 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: 33.9105 ymin: -1.038021 xmax: 37.93755 ymax: 1.653522
geographic CRS: WGS 84
County Population Area PD PDR Factor CRI
1 BARINGO 666763 11075.3 60.20 0.71 0.00951 0.02902
2 BOMET 875689 1997.9 438.30 5.20 0.06927 0.21127
3 BUNGOMA 1670570 2206.9 756.98 8.98 0.11963 0.36487
4 BUSIA 893681 1628.4 548.81 6.51 0.08673 0.26453
5 ELGEYO/MARAKWET 454480 3049.7 149.02 1.77 0.02355 0.07183
6 EMBU 608599 2555.9 238.12 2.82 0.03763 0.11478
geometry
1 MULTIPOLYGON (((35.78413 1....
2 MULTIPOLYGON (((35.45192 -0...
3 MULTIPOLYGON (((34.62083 1....
4 MULTIPOLYGON (((33.91369 0....
5 MULTIPOLYGON (((35.5598 1.2...
6 MULTIPOLYGON (((37.55331 -0...
head(gender)
# A tibble: 6 x 5
County Male Female Intersex Total
<chr> <dbl> <dbl> <dbl> <dbl>
1 Total 23548056 24014716 1524 47564296
2 Mombasa 610257 598046 30 1208333
3 Kwale 425121 441681 18 866820
4 Kilifi 704089 749673 25 1453787
5 Tana River 158550 157391 2 315943
6 Lamu 76103 67813 4 143920
Transform county size based on population, then append to the existing dataset. The polygon transformation is conducted in county_size.
centroids <- counties %>% st_geometry %>%
st_transform(3857) %>% st_centroid %>% st_transform(4326)
# toTitleCase() does not lower-case all-caps input, so lower-case the names first
county_form <- transmute(counties, county=tools::toTitleCase(tolower(County)), population=as.numeric(Population),
pop_scale=scales::rescale(population, c(0.01,1)), centroid = centroids, type='Colored'
) %>% st_set_crs(4326)
# shift sizes around centroids
county_size <- mutate(county_form, geometry = (geometry-centroid)*pop_scale+centroid,
type='Sized') %>% st_set_crs(4326)
# replicate the original shape with NA for population so the color fill can animate
county_blank <- mutate(county_form, population = NA, type='Default')
# bind together for gganimate so I can switch between fill states
county_anim <- bind_rows(county_blank, county_form, county_size) %>%
mutate(type = factor(type, c("Default",'Colored','Sized')))
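The resizing step above is an affine map applied to every vertex: subtract the centroid, multiply by a scale factor, then add the centroid back, i.e. p' = (p - c) * s + c. The sketch below reproduces that arithmetic in base R on a plain coordinate matrix, so it runs without sf. The helpers `rescale_linear()` and `scale_about()` are hypothetical stand-ins written for illustration, and the unit square is not real county geometry. The populations are the six rows printed by head(counties), so the resulting factors differ from those computed over all 47 counties.

```r
# Populations of the first six counties, copied from head(counties) above.
population <- c(Baringo = 666763, Bomet = 875689, Bungoma = 1670570,
                Busia = 893681, `Elgeyo/Marakwet` = 454480, Embu = 608599)

# Base-R stand-in for scales::rescale(x, to = c(0.01, 1)): a linear map of
# the observed range onto [0.01, 1].
rescale_linear <- function(x, to = c(0.01, 1)) {
  (x - min(x)) / (max(x) - min(x)) * (to[2] - to[1]) + to[1]
}
pop_scale <- rescale_linear(population)

# Scale a matrix of vertices (one row per point) about a fixed center:
# p' = (p - c) * s + c
scale_about <- function(coords, center, s) {
  sweep(sweep(coords, 2, center, "-") * s, 2, center, "+")
}

# Illustrative unit square; the closing vertex of the ring is omitted.
square <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1))
center <- colMeans(square)  # vertex mean, which is the centroid for a square
half   <- scale_about(square, center, 0.5)

print(round(pop_scale, 3))
print(half)
```

Because the factor is applied to lengths, a polygon scaled by s covers s^2 of its original area, so a county at a scale of 0.5 is drawn at a quarter of its original area.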
How is Kenya’s population distributed across counties? Resize the polygons from 1% to 100% of their original size based on population. The GIF is included at the top; this is the code that generated it.
anim <- ggplot(county_anim) +
geom_sf(aes(fill=population)) +
# animation specs
transition_states(
type, transition_length=1, state_length = 2
)+
ease_aes('cubic-in-out')+
enter_fade()+
# back to ggplot; add counties with no fill so outlines stay consistent
geom_sf(data=counties, fill=NA) +
scale_fill_steps(n.breaks=4, low="#bdc9e1", high="#045a8d", labels=scales::comma,
guide='legend', trans='log10') +
labs(title="Kenya's Counties", fill='Population') +
theme_map() +
theme(
title=element_text(size=16, color='black')
)
animate(anim, nframes = 20)
# save the animation rendered above; passing `anim` here would re-render it with default settings
anim_save('kenya_county_size.gif')
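The `transition_length` and `state_length` arguments are relative, not absolute, so the 20 frames are shared out in proportion to them. A rough back-of-the-envelope sketch follows, assuming gganimate's default `wrap = TRUE` (the last state transitions back to the first). The actual frame assignment is handled inside gganimate and is rounded to whole frames, so treat these as approximate shares.

```r
# Relative timing taken from the transition_states() call above.
n_states          <- 3   # Default, Colored, Sized
state_length      <- 2
transition_length <- 1
nframes           <- 20

# Assumption: wrap = TRUE, so there is one transition per state, including
# the one from the last state back to the first.
total_units <- n_states * (state_length + transition_length)

frames_per_state      <- nframes * state_length / total_units
frames_per_transition <- nframes * transition_length / total_units

print(c(state = frames_per_state, transition = frames_per_transition))
```

Under these assumptions each pause gets about twice as many frames as each transition, which is why the map lingers on each state before morphing.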
An obvious limitation of this approach is that a polygon can only grow to 100% of its original size without exceeding its original boundaries. In this map, one of the smallest counties by area, Nairobi, has the largest population, so the visual impact of the shifts is really messy. A possible cleanup strategy would be to shrink the original polygons first, so that the population-scaled polygons could then grow to fit the original boundaries.
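One way to read the cleanup idea above is to draw every county at a small baseline size first, then let population decide how far each one grows back toward its real boundary. The sketch below only computes the scale factors. The helper `growth_scale()` and the baseline of 0.25 are hypothetical choices for illustration, not something the original post implements.

```r
# Hypothetical helper: map population onto [baseline, 1] instead of [0.01, 1].
# `baseline` is the shrink factor applied to every county in the starting state.
growth_scale <- function(population, baseline = 0.25) {
  stopifnot(baseline > 0, baseline < 1)
  rng <- range(population)
  (population - rng[1]) / (rng[2] - rng[1]) * (1 - baseline) + baseline
}

# Populations of the first six counties printed by head(counties).
population <- c(666763, 875689, 1670570, 893681, 454480, 608599)
target <- growth_scale(population, baseline = 0.25)

# Growth relative to the shrunken starting shape; every value is >= 1.
print(round(target / 0.25, 2))
```

In the post's pipeline, the starting state would then use `(geometry - centroid) * baseline + centroid` and the sized state would use `target` in place of `pop_scale`. This only guarantees that no polygon has to exceed its own boundary; it does not make drawn area proportional to population.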