String tip: vectorized pattern replacement
By Deependra Dhakal in R tidyverse
December 22, 2018
Example case
Suppose you have a bunch of really filthy names, the kind that make you want to puke. You can fix them with the help of stringi and stringr.
Let's say the following character vector holds those filthy names.
filthy <- c("Grains %", "Moisture (gm/kg)", "Plant height (cm)", "White spaces", "White space (filth%)")
filthy
## [1] "Grains %" "Moisture (gm/kg)" "Plant height (cm)"
## [4] "White spaces" "White space (filth%)"
Now get rid of the filth with a short string-manipulation pipeline.
library(magrittr) # provides the %>% pipe

filthy %>%
  stringr::str_squish() %>% # trim the ends and collapse repeated whitespace to a single space
  stringi::stri_replace_all_fixed(pattern = c(" ", "/", "%", "(", ")"),
                                  replacement = c("_", "_per_", "_percent_", "_", "_"),
                                  vectorize_all = FALSE) %>% # apply every pattern/replacement pair, in order, to every string
  stringr::str_replace_all("_+", "_") %>% # collapse runs of underscores into one
  stringr::str_replace_all("_$", "") # drop a trailing underscore
## [1] "Grains_percent" "Moisture_gm_per_kg"
## [3] "Plant_height_cm" "White_spaces"
## [5] "White_space_filth_percent"
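The part doing the heavy lifting above is vectorize_all = FALSE. As a minimal sketch of what that switch changes (the vector x and the two pattern/replacement pairs below are made up for illustration and are not part of the example above), compare the default pairing with the "every pattern on every string" behaviour:

```r
library(stringi)

# Illustrative inputs only: each string contains both patterns
x <- c("a/b%", "c%d/")
patterns <- c("/", "%")
replacements <- c("_per_", "_percent_")

# Default (vectorize_all = TRUE): element i of x is paired with pattern i only
paired <- stri_replace_all_fixed(x, patterns, replacements)
paired
## [1] "a_per_b%"     "c_percent_d/"

# vectorize_all = FALSE: every string is processed by all patterns in turn
all_patterns <- stri_replace_all_fixed(x, patterns, replacements,
                                       vectorize_all = FALSE)
all_patterns
## [1] "a_per_b_percent_" "c_percent_d_per_"
```

With the default, the "%" in the first string and the "/" in the second survive, because each string only ever sees its own pattern. That is why the cleanup pipeline needs vectorize_all = FALSE. Note also that the replacements run in the order given, so a later pattern can match text produced by an earlier replacement.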