String tip: vectorized pattern replacement

By Deependra Dhakal in R tidyverse

December 22, 2018

Example case

Suppose you have a bunch of really filthy names, the kind that make you want to puke. You can fix them with the help of stringi and stringr.

Let's say the following character vector holds those filthy names.

filthy <- c("Grains %", "Moisture (gm/kg)", "Plant height (cm)", "White   spaces", "White space  (filth%)")
filthy
## [1] "Grains %"              "Moisture (gm/kg)"      "Plant height (cm)"    
## [4] "White   spaces"        "White space  (filth%)"

Now, to get rid of the filth, use string manipulation.

library(magrittr) # provides the %>% pipe

filthy %>% 
  stringr::str_squish() %>% # trim the ends and collapse repeated whitespace into single spaces
  stringi::stri_replace_all_fixed(pattern = c(" ", "/", "%",  "(", ")"), 
                                  replacement = c("_", "_per_", "_percent_", "_", "_"), 
                                  vectorize_all = FALSE) %>% # apply every pattern/replacement pair, in order, to each element
  stringr::str_replace_all("__", "_") %>% # collapse doubled underscores left by adjacent replacements
  stringr::str_replace_all("_$", "") # drop the trailing underscore
## [1] "Grains_percent"            "Moisture_gm_per_kg"       
## [3] "Plant_height_cm"           "White_spaces"             
## [5] "White_space_filth_percent"
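
The key argument in the pipeline above is vectorize_all = FALSE. As a minimal sketch of why it matters, the snippet below runs the same call both ways on a small made-up vector (the names x, paired and sequential are only illustrative and do not come from the example above). With the default vectorize_all = TRUE, stringi pairs the i-th pattern with the i-th string, so each string gets only one of the replacements; with FALSE, every pattern is applied to every string in turn.

```r
# Two identical toy strings, so any difference comes from the argument alone
x <- c("a/b c", "a/b c")

# Default behaviour: pattern[i] is matched only against x[i]
paired <- stringi::stri_replace_all_fixed(
  x,
  pattern = c(" ", "/"),
  replacement = c("_", "_per_"),
  vectorize_all = TRUE
)

# vectorize_all = FALSE: all patterns are applied, in order, to each string
sequential <- stringi::stri_replace_all_fixed(
  x,
  pattern = c(" ", "/"),
  replacement = c("_", "_per_"),
  vectorize_all = FALSE
)

paired      # "a/b_c" and "a_per_b c": one replacement per string
sequential  # "a_per_b_c" for both strings
```

Because the pairs are applied in order, a later pattern also sees the text inserted by an earlier replacement, which is worth keeping in mind when choosing replacement strings.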